This report was prepared by Itek Corporation, 10 Maguire Road, Lexington 73, Massachusetts,
under Contract Number AF30(602)-3494 and Project Number 4599. The
RADC project engineer is Zbigniew L. Pankowicz, RMfm.



This is a final report and covers the period of work from 1 July 1964 to 30 June 1965.



This report is intended to document research efforts in linguistic studies for Chinese
to English machine translation. The objectives to be reached under the contract are
as follows:

1. Morphological and syntactical analysis of modern Chinese for machine
translation applications.

2. Compilation of 15,000 Chinese input entries and 15,000 English output
entries on magnetic tape.

3. Compilation of linguistic rules in symbolic notation.









This work covers a 1-year period of main emphasis on linguistic research in the area
of morphological and syntactical problems in modern Chinese. As part of a company
sponsored, independent research effort, programming testing efforts were incorporated
in the second half of the year to test the validity of the linguistic rules and the
practicability of the operations for machine application. The company sponsored effort
is detailed in Section 2.5 and in Appendix B of this report, to fulfill the requirements
of part VI, paragraph e-3, of the contract schedule.



The contract research efforts have resulted in the following:

1. A linguistic machine translation system from the Chicoder input of the
source language to the linguistic rule operations to the output of the target
language.

2. A magnetic tape and a list of 15,000 Chinese input entries and 30,000 English
output entries plus input and output entries of specialized vocabulary
of general function words.

3. A list of linguistic rules, in symbolic notation, that is divided into sections
according to operations specified in this report (the linguistic rules in
symbolic notation are contained in a separate report, "Linguistic Rules in
Symbolic Notation," October 15, 1965, Itek Corporation).

4. Details of programming testing of the linguistic rules and of testing of
operations for these rules.

The research efforts were made by Itek linguistic research personnel under the direction
of Mrs. Theresa G. Lee of the Chinese Programs Group. Additional authors of
this report are: H. T. Wang, S. C. Yang, and E. L. Farmer.

Special assistance was rendered by C. B. Burgess of the Computer Sciences Department.
The late Dr. Jennings Wong contributed significantly to the research efforts
of this project in the source and target language analysis.



Special acknowledgement is noted here for the assistance rendered by Mr. Zbigniew
L. Pankowicz and Mr. Wing Y. Hoo, the technical monitors of this project at RADC
during this year of study.
To increase the capabilities of a developmental Chinese to English machine
translator, the design of a basic linguistic processing system utilizes
a Context Associative Method (CAM). This technique allows machine
translation through the use of programmed contextual operations.

The results of the research effort are presented in this report and include: (1) explanation
of the linguistic processing system, (2) morphological and syntactical analysis,
and (3) English inflection analysis for Chinese to English machine translation.
Illustrations showing step by step linguistic processing are included in this report.
Recommendations are presented for refinement and further development of the basic
linguistic analysis to further the goal of performing Chinese to English machine
translation. Three appendices are included: an explanation of symbols for linguistic rules
(Appendix A), listings of computer experimentation (Appendix B), and listings of
verb components in English output (Appendix C).

EVALUATION




The machine translation R&D effort described in subject TR encompasses the following
activities: (1) linguistic analysis of source language; (2) compilation of 15,000
bilingual entries covering the field of political science; and (3) formulation of
linguistic rules for processing of source language data. The machine translation system
thus constructed uses the Chicoder as an input device and the photostore type equipment
as a central processor. Preliminary results of this R&D effort have been programmed
and partly tested on the PDP-1 system.

The linguistic research described in subject TR is oriented toward obtaining early
practical results for application to machine translation. The authors recommend
further linguistic studies in morphology and syntax as a means of improving the
English output.




Textual Data Handling Section 
Information Processing Branch 




1. INTRODUCTION 



The research efforts for the past year have been devoted to the linguistic studies of Chinese
to English machine translation. In this course of study, a system has been developed for linguistic
processing from the input of Chinese text to the output of English. For this process of development,
the following major goals have been set:

1. To orient the linguistic research toward the practicability of machine application

2. To develop a practicable system for transferring the source language, Chinese, to the target
language, English, by analyzing the source language with the use of modern Chinese
political texts, which cover many morphological and syntactical functions of the source
language

3. To formulate a grammatical tag system for Chinese morphology and to formulate a linguistic
rule operation system for Chinese syntax

4. To formulate an English output system to be compatible with input

5. To provide, on magnetic tape, 15,000 lexicographic entries in the field of political science
with appropriate grammatic tags and to add general vocabulary input and output entries,
which are utilized in the linguistic processing system

6. To determine a programming system for the processing from input to output in testing the
linguistic rules.

Our linguistic analysis approach is an attempt to bring the source language and the target
language closer to each other. The input of the source language and the quality of translation
derived from the linguistic analyses are therefore considered from every aspect of the problems
involved and their resolutions.




Political science texts such as "Ren Min Ri Bao"* and "Hong Qi"* were studied continuously
during the year for a constant sampling of texts to modify or enhance the linguistic study,
since modification of language structures is constant and the best source of research is the actual
writing in the language. Machine oriented linguistic analysis texts and general linguistic texts,
as listed in Section 8, were studied for ideas that are applicable to the present research.

The first task was to determine the major morphological classes through analysis of Chinese
words. Each major morphological class was given a series of subtags that define the grammatic
subtlety of the morphological class. The morphological grammatic tags were modified or redefined
as research progressed to the syntactic level.

The second task, which was concurrent to the first, was the syntactical analysis. On this level
the words were grouped into major phrases and attributive clauses and phrases. These phrases
and clauses were analyzed with consideration given to the relative importance of their roles in the
sentence. Within this task the problems of linking major phrases in a sentence and of sentence
patterns were also considered.

The third task was to formulate and analyze English inflection tables for proper translation
output. The major word classes under consideration for this task were the nominals, verbals,
adjectivals, and adverbials.

During the latter half of this study, applicability of the linguistic analysis to machine operations
was emphasized, and this recognition of eventual machine implementation continually influenced
the linguistic processing system composition throughout the rest of the period.

When the basic theories of morphological and syntactical analysis were established, the pro- 
duction of lexicographic entries and linguistic rules began. It is to be emphasized that all catego- 
rizing of linguistic operations and lexicographic work is based on study of the source and target 
languages and machine applicability. 

This project is an attempt to bring the source language closer to the target language morphologically,
syntactically, and in the relationship of sentence structures to patterns. The Contextual
Associative Method first looks up the processing sentence morphologically. The major and attributive
structures are then segmented for processing within the structures, and the major structures are
linked for correct processing of output.



*References may be found in the bibliography (Section 8) of this report.



2. LINGUISTIC PROCESSING SYSTEM FOR CHINESE
TO ENGLISH MACHINE TRANSLATION

The linguistic processing of Chinese to English machine translation is divided into the
following steps:

1. Operation of the Chicoder to generate input tape for the language processor

2. Lookup of Chinese grammatic tags from information on input tape and in dictionary entry
tables on magnetic tape or disc

3. Performance of linguistic sentence analysis through the use of linguistic rule tables, both
programming and lookup techniques to be used in deriving English grammatic tags

4. Lookup of word stem of the appropriate English translation from English grammatic tags

5. Lookup of appropriate English forms for the word stem.

2.1 DESCRIPTION OF CHICODER AND ITS FUNCTION IN MACHINE TRANSLATION

The Chicoder is a device, designed to encode Chinese characters, that was completed under an
Air Force contract awarded by RADC. It has the same number of keys as the English typewriter;
all codes therefore refer to the alphabet letters or Arabic numerals on the English typewriter
keys. The Chicoder has a vocabulary of 10,518 Chinese characters, and it is designed so that 90
percent of the common Chinese characters can be encoded in three strokes.

The Chicoder is designed to operate in two modes, English and Chinese. In the English mode,
the keyboard functions like the regular English typewriter in the lower case. In the Chinese mode,
Chinese characters may be encoded, and the keyboard functions like the English typewriter in the
upper case. Punctuation marks and mathematical symbols are in the Chicoder code (Iv).

Two codes are typed to position a character, and a line of five Chinese characters is displayed
on the screen. The character is selected by typing one of the position keys designated on
the lower left corner of the a, s, d, f, and g keys. If the character is not on the first line, the
sequencing (SEQ) key is depressed for the second line. If it is not there, SEQ is depressed for the
next line, and so on.

The encoding scheme looks at the whole character as a unit square divided into quadrants,
i.e., I, II, III, and IV. The top and bottom character configurations, or characteristics, are
selected by an examination of the character stroke patterns contained therein. The top
characteristics can usually be found in quadrants I and II or in II alone, and the bottom
characteristics can usually be found in quadrants III and IV or in IV alone. The following is an
example of a character separated by quadrants:




To encode the above character, keys P, 7, and 1 (G) are depressed, i.e.




The keys function as follows: 

1. The P key represents quadrants I and II.

2. The 7 key represents quadrants III and IV.

3. The sequencing key identifies the row containing the character.

4. The G key, which contains a 1 in the left hand corner, identifies the column containing the
character.

The input to the Chicoder is by means of a Friden Flexowriter. When depressed, each of the
keys inscribed with stroke patterns generates two unique 6-bit binary codes. The output of the
Chicoder is a punched paper tape. Each character is represented by three 6-bit binary codes, or
characters. The first 6-bit tape character is a unique code representing the upper stroke pattern,
and the second is a unique code representing the lower stroke pattern. The third 6-bit character
uniquely identifies the character position in a 5 by 5 matrix. Three bits are used to encode the
row, and the other three bits are used to encode the column. A fourth 6-bit tape character
indicates character end.
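
The tape format described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report says that three bits encode the row and three bits encode the column, but it does not say which bits come first, so placing the row in the high bits is an assumption. The end-marker value and the function names are likewise hypothetical.

```python
# Hypothetical end-of-character marker; the report does not give its value.
END_OF_CHARACTER = 0o77


def pack_position(row: int, col: int) -> int:
    """Pack a position in the 5 by 5 matrix into one 6-bit tape character."""
    if not (0 <= row < 5 and 0 <= col < 5):
        raise ValueError("row and column must be in the range 0..4")
    # Assumed layout: row in the high three bits, column in the low three bits.
    return (row << 3) | col


def unpack_position(code: int) -> tuple[int, int]:
    """Recover (row, column) from a packed 6-bit position code."""
    return (code >> 3) & 0b111, code & 0b111


def encode_character(upper: int, lower: int, row: int, col: int) -> list[int]:
    """Return the four 6-bit tape characters for one Chinese character."""
    for value in (upper, lower):
        if not 0 <= value < 64:
            raise ValueError("stroke-pattern codes must fit in 6 bits")
    return [upper, lower, pack_position(row, col), END_OF_CHARACTER]
```

Three bits can hold values 0 through 7, so a 5 by 5 matrix fits with room to spare in each half of the position code.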

The punched paper tape output of the Chicoder is used as input for language processing from
Chinese to English. Each Chinese character is represented in Chicode, which contains four characters
that are combinations of letters and numbers to make up the lexicographic entries. The first character
represents the upper stroke pattern, the second character represents the lower stroke pattern, the
third character the row, and the fourth character the column. A Chinese
lexicographic entry of more than one Chicode can be made up, i.e., a Chinese word may be composed
of more than one Chinese character, and the Chicode serves as the unique code for each
character.

The primary function of the Chicoder in Chinese to English machine translation is to generate,
from any given Chinese text, a punched paper tape for input to the language processor. Its second
function is to provide identifying tags for Chinese characters in the lexicographic entries. This is
of prime importance in the complete system of language processing, from the input of Chinese to the
output of English.



2.2 DICTIONARY ENTRY FORMAT AND ITS RELATIONSHIP TO THE LINGUISTIC
PROCESSING SYSTEM

Over 15,000 lexicographic entries consisting of general terms and political science terminology
have been completed on a linguistic analysis level, first on card form and then on magnetic
tape. The information on each card includes the following items:

1. Romanization (Pinyin)

2. Chinese characters

3. Chicodes

4. Chinese grammatic tags

5. English grammatic tags

6. English translation stem

7. English inflection table tags.

A sample dictionary entry card is shown in Fig. 2-1 and explained in Section 2.5.3.

Selection of lexicographic items from all the word classes (see Section 3) of which the entries
are composed is made with the aid of the "Chinese-English Dictionary of Modern Communist
Chinese Usage"* for the political and general terminology, as well as with the fruits of research
and study of general texts listed in Section 8.

The lexicographic entries are divided into three dictionaries (I, III, and IV) for linguistic
analysis and programming operations. The argument of dictionary I consists of Chicodes, and the
function consists of Chinese grammatic tags. The Chinese grammatic tag may contain ambiguities
such as V/N (see Section 3.2.1), for which both the verb and noun tag words are listed. Linguistic
analysis is accomplished through the use of linguistic rule tables contained in dictionary II, which
will be discussed later in this section. The argument of dictionary III consists of Chicodes and
English grammatic tags for each word applicable for the entry, and the function consists of English
translation stem and English inflection table tags. The English grammatic tags result from altering
the original grammatic tags to the correct English grammatic form through the processing of
linguistic rules. The English inflection table tags indicate the appropriate English ending for
nouns, adjectives, adverbs, and verbs. The noun inflection tables give singular and plural
formation and singular and plural possessive formation. Verb inflection tables give tense, infinitive,
negative, and auxiliary formation. Adjective and adverb inflection tables give regular, comparative,
and superlative formation. For a detailed explanation of the use of English tag tables, see
Section 5.

2.3 LINGUISTIC RULE ENTRY FORMAT AND ITS RELATIONSHIP TO THE LINGUISTIC
PROCESSING SYSTEM

The linguistic rules are the results of the semantic and syntactical analyses and of the
incorporation of the analyses into a machine translation system. These rules are grouped into six
major linguistic passes, and each rule is illustrated in symbolic notation on a card whose left and
right sides indicate the argument and the function respectively.

Some examples of linguistic rules are:

1. AJXXXX₁ + OA₂ (argument) → ADXXXX₁ (function)

2. ADXKOU₁ + V~/N (argument) → ADXKOU₁ + V~ (function)

The explanation of linguistic symbolic notation is as follows: 

1. X = anything

2. / = ambiguity division of tag words

3. + = division between one grammatic tag and the other

4. = up to and including the stated number of subtags 

5. → = division between argument and function.

The subscripts are word order indicators. A complete explanation of linguistic symbols is pre- 
sented in Appendix A. 
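
A minimal sketch of how the X and / symbols might be interpreted when a rule tag is compared with a tag from the processing sentence. It assumes that a rule tag and a sentence tag have the same length and are compared position by position; the report does not state this, and the sample tag values in the usage note are invented for illustration.

```python
def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if a rule tag matches a sentence tag.

    An X in the rule tag stands for "anything"; every other position
    must match exactly. Equal length is an assumption of this sketch.
    """
    if len(pattern) != len(tag):
        return False
    return all(p == "X" or p == t for p, t in zip(pattern, tag))


def alternatives(tag_word: str) -> list[str]:
    """Split an ambiguous tag word such as V/N into its alternatives."""
    return tag_word.split("/")
```

For example, `tag_matches("AJXXXX", "AJ1200")` is true for the hypothetical tag `AJ1200`, and `alternatives("V/N")` yields the verb and noun readings separately.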

Each major linguistic pass utilizes one or more sets of linguistic rules. These rules are
matched against the processing sentence to find a match by word sequence. Special programming
operations and lookup techniques are used to alter the original match (the argument) into the
function. These sets of rules are dictionary II in the linguistic processing system. They are
important in that they furnish the translation system the refinement of language processing. They
serve to resolve ambiguities, to group major phrases and attributive structures, to connect the
subject with the predicate, and to give inflection.

2.4 LINGUISTIC PROCESSING SYSTEM

In explaining the linguistic processing system, we proceed from the input of Chinese characters
to the output of English words. A typist takes a Chinese text and punches from the Chicoder
a paper tape, which is used as input for the language processor. A search is made to determine
the beginning and end of the sentence, since the linguistic analysis process presently deals with
only one sentence at a time. A sentence is indicated by the segmentation indicators KPI, which
indicates the beginning of the sentence, and KPT, which indicates the end of the sentence.
Dictionary I is utilized and the information from that dictionary for each word of the sentence is
extracted and read into the active memory. A processing sentence is illustrated as follows:

KPI Chicodes/Chinese grammatic tags (first word) Chicodes/Chinese grammatic tags
(second word) Chicodes/Chinese grammatic tags (third word) KPT
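
The layout of the processing sentence can be sketched as follows. This is an illustration only: dictionary I is modeled as an ordinary Python mapping from Chicodes to Chinese grammatic tags, and the Chicode strings and tag values used in the usage note are hypothetical placeholders, not entries from the actual dictionary.

```python
def build_processing_sentence(words: list[str], dictionary_i: dict[str, str]) -> list[str]:
    """Wrap one sentence in KPI/KPT and attach dictionary I tags to each word.

    `words` holds the Chicodes of each word in sentence order.
    """
    sentence = ["KPI"]  # beginning-of-sentence indicator
    for chicodes in words:
        tags = dictionary_i[chicodes]  # raises KeyError for a missing entry
        sentence.append(f"{chicodes}/{tags}")
    sentence.append("KPT")  # end-of-sentence indicator
    return sentence
```

With a toy dictionary `{"A1B2": "N", "C3D4": "V/N"}`, the call `build_processing_sentence(["A1B2", "C3D4"], ...)` gives `["KPI", "A1B2/N", "C3D4/V/N", "KPT"]`.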

The processing sentence is isolated, and Chinese grammatic tags are introduced into the
linguistic processing operation. Dictionary II, consisting of linguistic rule operations, is then
introduced. The sentence is then examined, patterned, and reordered by programming and table
lookup techniques. The major programming operations consist of insertion, deletion, reordering,
masking, and phrase segmentation. The words on which these operations are performed may be
tag words, English words, Chicodes, or translation words.

The linguistic rule processing is composed of six major linguistic passes, which are different 
from programming passes. The linguistic passes deal with symbolic notation of rules and ex- 
planations of operations for these rules, while the purpose of programming passes is to reduce 
machine time to a minimum by grouping the linguistic rules according to different levels of operation. 

Table 2-1 is a list and explanation of the linguistic passes. The major goal of pass 1 is to
utilize as much as possible the classes of words that function linguistically as phrase or structure
initial or terminal indicators. From these words, major phrases are found and formed into
syntactical patterns in subsequent passes. Words with more than one tag word (V/N, A/N, etc.),
which we call "ambiguities," are resolved as much as possible, depending on the position of the
word relative to other tag words or indicators.

A series of operations is initiated to determine the indicators of verb phrase patterns. In
pass 1, reordering of adverbs within the verb phrase and tense and modal verb operations are
recognized and indicated. The head word of the verb phrase is then singled out with an indicator.

In pass 2, the resulting sequence of tag words is matched against a list of noun phrase patterns.
When a match is found, a noun phrase is recognized and the appropriate head word of the
phrase is singled out with an indicator. The words within the noun phrase are then rearranged
according to English word sequence.

Pass 3 detects attributive structures, which include phrases and subordinate clauses, and
finds the major phrase to which these attributes are related. The attributive structures are then
isolated so that a syntactical pattern becomes apparent.

Pass 4 consists of a series of operations to discover the existence of noun phrase-relative
clauses in the sentence. When a noun phrase-relative clause is found, the word order is rearranged
according to equivalent English word sequence. The head word of this structure is again
singled out and treated as a nominal in the syntactical structure.

Pass 5 links the head words of all major phrases in the sentence and determines the sentence
pattern according to linguistic tags of each head word. This pass completes operations on all the
tables in dictionary II.

The final pass, pass 6, selects the grammatically and semantically appropriate English word
from a group of translations for each Chinese word. This pass utilizes dictionaries III and IV.

The details of each linguistic pass are further amplified in the following paragraphs.

2.4.1 Pass 1 

Pass 1 is divided into eight subpasses. Pass 1A sections the sentence according to the existing
punctuation marks. This is utilized to indicate the possibility of initial and terminal points
of phrases or clauses. When a punctuation mark is found, two indicators are usually inserted
before and after the punctuation mark. The primary phrase segmentation table has two types of
punctuation indicators. The KC indicator is inserted before and after punctuation marks within a
sentence, such as comma and semicolon. The KP indicator is inserted before and after punctuation
marks, such as question mark, period, paragraph indicator, and exclamation mark, that
indicate the end of a sentence. For example:

1. PC → KCI + PC + KCT

2. PP → KPT + PP + KPI
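
The two segmentation rules above amount to a table-driven substitution over the tag sequence. The sketch below copies the indicator order exactly as the rules are printed in this report; the function name and the representation of a sentence as a list of tag words are assumptions made for illustration.

```python
# Rule table copied from the two example rules: argument -> function.
PUNCTUATION_RULES = {
    "PC": ["KCI", "PC", "KCT"],  # punctuation within a sentence
    "PP": ["KPT", "PP", "KPI"],  # punctuation that ends a sentence
}


def insert_punctuation_indicators(tags: list[str]) -> list[str]:
    """Surround each punctuation tag word with its pair of indicators."""
    output: list[str] = []
    for tag in tags:
        # Tag words without a rule pass through unchanged.
        output.extend(PUNCTUATION_RULES.get(tag, [tag]))
    return output
```

Applied to `["N", "PC", "V", "PP"]`, this produces `["N", "KCI", "PC", "KCT", "V", "KPT", "PP", "KPI"]`.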

Pass 1B picks out special structures, such as comparative structures or interrogative
sentences, without going through subsequent passes, since the syntactical patterns for these
structures need further analysis.

Discontinuous structures that are used for subtle shades of meaning for words having no
appropriate English equivalents are scanned and masked. For example:

• ^ (dark night has ending).

The more appropriate translation would be: "The dark night does have an ending." For the
present ^ is masked, since it is treated as a verb, and emphasis is not analyzed in this phase of
study.

Another example is: 

. ft (brave youths). 

It is more appropriate to use the adjective form of the noun for jj| ... 6^ with certain nouns
between than to consider this as a noun phrase-relative clause. In similar phrases, such as
‘^^ (political problems), it is better to use the adjective form of ^ (political).

Pass 1C picks out adverb independent words that do not affect the rest of the sentence, and
they are masked for subsequent scanning (see Section 3.3).

In Pass 1D, the functions of specific connominal, collocative, and converbal class words are
determined by examining their immediate environment, and they are indicated accordingly. These
words can therefore be utilized as phrase or structure initial and terminal indicators. For
example, if -^^ is immediately followed by a collocative initial, it is used as a converbal and not as
a verb. Those words whose functions cannot be decided on until the scanning of a greater structure
will be dealt with later (see Sections 3.13 and 3.15).

Pass 1E searches for the initial or terminal indicators for prepositional phrases or attributive
structures. When a phrase or structure indicator is found, the KI and KX indicators are
inserted before and after phrase or structure tag words such as LIX and CCXXX. A minor
indicator (Of) is inserted after the phrase indicator word if it is an initial indicator or before if it
is a terminal indicator. Initials and terminals must be equal in number, and they are paired by
linguistic routines.

Pass 1F deals with the resolving of ambiguities of words with more than one tag word, a
circumstance that depends on the immediate environment of the word. The rules may be reapplied
in the sentence for these ambiguities. For example:

1. AAXXXX₁ (adjective adverb) + OA₂ (^) + V₃ = ADXXXX₁ (adverb) + V₃

2. AJXXXX₁ + V/N₂ = AJXXXX₁ + N₂
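
The second rule above, which resolves a V/N ambiguity to N when it follows an adjective tag, can be sketched as a scan over adjacent tag words. The treatment of X as a position-by-position wildcard over tags of equal length is an assumption of this sketch, and the tag value `AJ0000` in the usage note is invented for illustration.

```python
def matches(pattern: str, tag: str) -> bool:
    """Compare a rule tag with a sentence tag, treating X as "anything"."""
    return len(pattern) == len(tag) and all(
        p == "X" or p == t for p, t in zip(pattern, tag)
    )


def resolve_after_adjective(tags: list[str]) -> list[str]:
    """Apply the rule AJXXXX + V/N = AJXXXX + N to every adjacent pair."""
    resolved = list(tags)  # work on a copy; the input is left untouched
    for i in range(len(resolved) - 1):
        if matches("AJXXXX", resolved[i]) and resolved[i + 1] == "V/N":
            resolved[i + 1] = "N"
    return resolved
```

For the hypothetical sequence `["AJ0000", "V/N", "V/N"]`, only the tag word directly after the adjective is resolved, giving `["AJ0000", "N", "V/N"]`.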



Pass 1G searches for and isolates main verbs. A series of operations, including verb
phrase segmentation, adverb reordering, and inserting tense into verb grammatic tags, is made.
The verb is then singled out for subsequent passes (see Section 4.2).

Pass 1H resolves words with more than one tag word by examining their environment. This
operation may be performed after each major phrase operation is completed.

2.4.2 Pass 2

Pass 2 is concerned with the recognition of simple and complex noun phrases and with the
isolation of the head word from its attributes in the context. The head word is utilized for
subsequent passes for prepositional phrases, etc. In this pass, the number of the head word is
affected by its attributes, such as arithmates or numerals, that have number tags. Words are
reordered according to equivalent English word sequences. The detailed operation for this pass is
described in Section 4.1.

2.4.3 Pass 3 

This pass searches for initial or terminal points of collocative structures and connominal
phrases. When the initial or terminal point is found for the structure, indicators are inserted to
separate the structure from the processing sentence. The elements within the structure are
reordered for the proper translation. After this process, the tag words within the structure are
masked except for the connominal tag word and the initial and terminal structural indicators. The
connominal tag word is then referenced to the related verb for the proper connominal translation.
The collocative structures and connominal phrases are then syntactically reordered to follow the
related verb or verb and object if required. Each of these steps is described in detail in Sections
4.3 and 4.4.

2.4.4 Pass 4

In this pass, the environment of the noun phrase-relative clause is examined to determine the
initial and terminal points of the noun phrase-relative clause. When these are found, initial and
terminal indicators are inserted before and after the clause to segment the structure from the
sentence for subsequent operations. To derive the proper English translation of the noun
phrase-relative clause, tag words within it are reordered and English words are added where necessary.
All words are then masked except the head word, which is to be singled out for the verb linking
pass (see Section 4.5).

2.4.5 Pass 5 

By this time, all attributive structures have been masked. The modifiers of noun phrases
(including noun phrase-relative clauses) and of verb phrases have been masked, and only the head
words of these phrases remain in the processing sentence. At this stage, all attributive structure
initial and terminal indicators as well as verb phrase initial and terminal indicators (KXS) are
masked, with the exception of KCI, KCT, KPI, and KPT. Therefore, only major punctuation initial
and terminal indicators and head words of major phrases remain in the processing sentence. The
operations are then initiated to link the noun head word with the verb head word to give the verb
person and number. To derive certain operations, a series of rules indicating the lookup of the
next head word is necessary for this phase (see Section 4.6). After this process is completed, the
processing sentence is ready for the selection of the proper translation according to the English
grammatic tag of individual Chinese words (Chicodes).

2.4.6 Pass 6 

In the first five passes, the Chinese grammatic tags for each word are altered to the appropriate
English grammatic tags. In pass 6, dictionary III is utilized in matching the appropriate
English grammatic tags to find the correct stem of the English translation. The correct English
table tags are used in looking up the ending for the correct form. A set of special operations is
initiated to determine the proper form for the inflection specified in the English grammatic tag.
Dictionary IV gives the tables for these appropriate forms so that programming operations can be
specified to generate operations that search for the correct endings (see Section 5). Auxiliary
words and/or endings are then attached to the word stem to generate the English equivalent for
the Chinese word.
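
A minimal sketch of the two lookups described for pass 6, assuming that dictionary III maps a Chicode and an English grammatic tag to a stem and an inflection table tag, and that dictionary IV maps an inflection table tag to endings. All keys, tag names, and table contents below are hypothetical placeholders; the actual entries are in the dictionaries and in Section 5.

```python
# Hypothetical dictionary III: (Chicode, English grammatic tag) -> (stem, table tag).
DICTIONARY_III = {
    ("A1B2", "N"): ("problem", "N1"),
    ("A1B2", "V"): ("question", "V1"),
}

# Hypothetical dictionary IV: inflection table tag -> form name -> ending.
DICTIONARY_IV = {
    "N1": {"singular": "", "plural": "s"},
    "V1": {"present": "", "past": "ed"},
}


def english_form(chicode: str, english_tag: str, form: str) -> str:
    """Select the stem by grammatic tag, then attach the ending for `form`."""
    stem, table_tag = DICTIONARY_III[(chicode, english_tag)]
    ending = DICTIONARY_IV[table_tag][form]
    return stem + ending
```

The same Chicode yields different stems depending on the English grammatic tag that the earlier passes have assigned: `english_form("A1B2", "N", "plural")` gives `"problems"`, while `english_form("A1B2", "V", "past")` gives `"questioned"`. Auxiliary words are not modeled here.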

2.5 COMPUTER EXPERIMENTATION 

The computer used in the present system for Chinese to English machine translation (MT)
is a Digital Equipment Corporation PDP-1 with auxiliary disc storage. The system utilizes programs
that allow the application of automatic lookup, content addressed tree structure, and context
associative type techniques. Several special operations have been programmed to facilitate
a simulated content addressed table lookup method that is associative but allows nesting and
structuring for phrasing, etc. These initial operations are experimental in nature and are being
used both for proving the approach and for ultimate system design specifications. Chinese inputs
are typed on the Chicoder encoder and a paper tape is produced. This tape is processed by
the experimental MT system; outputs are punched on paper tape with subsequent English printout
on a Friden Flexowriter.

Three dictionaries are prepared according to the longest match principle to allow a simulation
of content addressing. Dictionary I accomplishes word segmentation and converts Chicodes
of Chinese words into pseudo-Chicodes (semantic tags) with Chinese (grammatic) tags. Dictionary
II converts Chinese tags into English grammatic tags through operation of different linguistic
rules. Dictionary III leads semantic tags and English grammatic tags to a final translation in
English. A set of subdictionaries, i.e., dictionaries for English inflection table tags, will later
enable automatic derivation of the proper auxiliary forms and word endings.

2.5.1 Processing Procedure

The processing procedure contains three passes using three dictionaries respectively.

Pass 1 — Chicodes to Semantic and Grammatic Tags. Dictionary I tape is written onto the
disc by an automatic loading program. The tape of Chinese text (in Chicodes) is then read in to be
processed by the content addressed lookup technique, i.e., when the Chicodes (argument) of a
Chinese word are associated with an entry in dictionary I, they will be replaced by the corresponding
semantic tags and Chinese grammatic tags (function). The output of pass 1 is thus a
string of semantic and Chinese grammatic tags. (See Appendix B for listings of Chinese text,
dictionaries, and outputs of the experiment.)
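
The longest match principle used for dictionary I can be sketched as a greedy scan that always prefers the longest dictionary entry starting at the current position. This is a modern illustration, not the PDP-1 program: the dictionary is modeled as a Python mapping from tuples of Chicodes to a pair of semantic tag and Chinese grammatic tag, and raising an error for unknown input is a choice made for this sketch.

```python
def longest_match_lookup(
    chicodes: list[str],
    dictionary: dict[tuple[str, ...], tuple[str, str]],
) -> list[tuple[str, str]]:
    """Segment a Chicode stream into words and replace each by its tags."""
    max_len = max((len(key) for key in dictionary), default=0)
    output = []
    i = 0
    while i < len(chicodes):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(chicodes) - i), 0, -1):
            key = tuple(chicodes[i:i + size])
            if key in dictionary:
                output.append(dictionary[key])
                i += size
                break
        else:
            # No entry starts here; how the real system handled this is not stated.
            raise KeyError(f"no dictionary entry starts at position {i}")
    return output
```

With entries for both a one-character word and a two-character word that begins with the same Chicode, the two-character entry wins wherever it applies, which is what lets dictionary I perform word segmentation as a side effect of lookup.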

Pass 2 — Application of Linguistic Rules. After dictionary II is loaded onto the disc, the
output of pass 1 is read in to be processed. Arguments of linguistic rule entries are associated
with the input stream, which is then modified by operation codes indicated in the functions. An
output tape of this pass is then produced as the input of the next pass. The output of pass 2 is a
string of semantic tags and English grammatic tags that are to be associated with the arguments
in dictionary III. Special operations are used here in facilitating associative techniques to ac-
complish the linguistic operations. The operations are described and explained in Section 2.5.4.

Pass 3 — Semantic Tags and English Grammatic Tags to English Translations. After dic-
tionary III is loaded onto the disc, the output of pass 2 is read in to get the final translation. This
pass uses only the content addressed lookup technique (with chained stems and endings). The
output of pass 3 is the English translation of the original Chinese text. However, the English
translation output for nominals, adjectival adverbials, and verbals appears in the stem form of the
vocabulary (see Section 5) with subsequent attachment and insertion of proper endings and auxi-
liary forms through the use of subdictionaries.

2.5.2 Dictionary Entry 

Dictionary Format. An entry card has been designed in special formats for dictionaries I
and III. This card is arranged for the convenience of direct punching without further coding work
(see Fig. 2-1). The entry card is divided into three portions. On the top portion of the card,
blocks on the right are for Chinese characters while blocks on the left are for their romanization
(Pinyin) letters. The space between Chinese characters and their romanized letters is left for
indicating word classification, i.e., noun, verb, etc. In the middle portion (for dictionary I),
Chicodes of Chinese characters (argument) are on the left, and pseudo-Chicode (semantic) and
Chinese (grammatic) tags (function) are on the right. In the lower portion (for dictionary III),
semantic and English grammatic tags (argument) are on the left while English translations and
inflection table tags (function) are on the right.

Chinese Characters. Chinese characters of a word are copied in blocks. They are picked up
mainly from the "Chinese-English Dictionary of Modern Communist Chinese Usage."* Other
sources include articles from "Hong-Qi" magazine, the "People's Daily,"** and other
publications.

Romanization Letters. These are the transliteration of Chinese characters. The romanization
procedure used is the Pinyin system of Communist China rather than the Wade-Giles system. The
"Chinese-English Dictionary of Modern Communist Chinese Usage"* is used as a standard refer-
ence for Pinyin romanization.

Chicodes. The Chicode of each Chinese character can be found from Chicoder typing or from 
the reference manual.** Each Chicode contains four characters that are combinations of letters 
and numbers. A slash is used as the delimiter of each Chicode, e.g., ckl2/t711/. 
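
The Chicode format just described (four alphanumeric characters, each followed by a slash) can be checked mechanically. The sketch below is an illustration only; the function name `split_chicodes` and its error handling are assumptions, and the sample string is the one printed above.

```python
def split_chicodes(text):
    """Split a slash-delimited Chicode string into its four-character codes."""
    if not text.endswith("/"):
        raise ValueError("every Chicode must be followed by a slash")
    codes = text[:-1].split("/")
    for code in codes:
        # Each Chicode is four characters made of letters and numbers.
        if len(code) != 4 or not code.isalnum():
            raise ValueError(f"not a four-character Chicode: {code!r}")
    return codes


codes = split_chicodes("ckl2/t711/")
```

A word of two characters therefore yields a list of two codes, which is the variable-length argument that dictionary I replaces with a fixed-length semantic tag.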

Pseudo-Chicode (Semantic Tag). For efficient processing, the pseudo-Chicode or semantic
tag is designed to represent variable length Chicodes of each Chinese word. Each semantic tag has
a fixed length of six characters that are combinations of letters and numbers with a slash as the
delimiter, e.g., vaab99/. The semantic tags are very important in the application of linguistic
rules during the computer processing, which will be described in Section 2.5.5.

Chinese Grammatic Tags. Chinese tags of each word have variable lengths due to different
word classifications, e.g., 10 tags for a noun and 16 tags for a verb. These tags are symbols of
grammatic analyses of a Chinese word.

English Grammatic Tags. English tags of each word also have variable lengths similar to
Chinese tags. These tags are English translation grammatic indicators of, e.g., word classifica-
tion, number, and person. There may be more than one classification (part of speech) for one
Chinese word. There may therefore be more than one set of English tags to indicate different
word classifications.

Translation. This refers to the English translation of one Chinese word. One Chinese word 
may be used for different word classifications (with different sets of English tags) and one Chinese 
word may therefore have more than one translation. 

2.5.3 Linguistic Rules 

Linguistic Passes. According to the procedure of linguistic analyses, linguistic rules are
classified into six linguistic passes with some subpasses. (These are not to be confused with the
dictionary passes, since much of the computer processing is internal in nature.) They are listed
and described elsewhere in this section as well as in other sections of this report. Linguistic
rules are referred to as dictionary II (pass 2 during processing) in the explanation of this experi-
ment. All linguistic passes except linguistic subpass 6B are included in processing pass 2.
Linguistic subpass 6B, "English word inflection lookup," is omitted at this stage. Further effort will
be made to continue full implementation of table tags for choosing the proper ending of English
words.

Writing Format of Linguistic Rules. Linguistic rules are written in a format similar to that
of the chemistry equation, with arguments on the left and functions on the right. Examples of some
linguistic rules are shown below. All letters and numbers in the equations are either Chinese tags
of a word (e.g., aa----) or linguistic tags for analyses (e.g., kct). Semantic tags of each word are
not shown in the equation. Subscripts to sets of tags indicate their sequence. Superscripts found
at the end of some sets of tags indicate the total number of tags in the set.

1. $\mathrm{AAXXXX}_1 + \mathrm{HM}_2 \rightarrow \mathrm{AJXXXX}_1 + \mathrm{HM}_2$

This rule is intended to solve a simple adjectival-adverbial ambiguity. The equation means that
when there is an adjectival-adverbial ambiguity (AAXXXX) that is immediately followed by a
special class word (HM), the adjective form (AJXXXX) is used.
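
Rule 1 can be restated as a small function over a list of grammatic tags. This is a sketch for illustration only: the report's own coding works on a character stream, whereas this version assumes one tag string per word, and the function name and sample tags are hypothetical.

```python
def resolve_adjectival(tags):
    """Rule 1: an AAXXXX word immediately followed by HM takes the AJXXXX form."""
    out = list(tags)
    for i in range(len(out) - 1):
        if out[i].startswith("AA") and out[i + 1].startswith("HM"):
            # Only the first two tags change; the remaining four are kept.
            out[i] = "AJ" + out[i][2:]
    return out


resolved = resolve_adjectival(["AA1234", "HM", "N000000000"])
untouched = resolve_adjectival(["AA1234", "N000000000"])
```

The second call shows that the ambiguity is left unresolved when no HM word follows.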

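2. $\begin{Bmatrix}\mathrm{KCT}\\ \mathrm{KPI}\end{Bmatrix}_1 + \$\,\overline{\mathrm{VTIC}\sim}^{16}_{2} + \text{UP TO NEXT}_3 + \mathrm{AJXXXX}_4 + \mathrm{HM}_5 + \mathrm{UX}_6 + \mathrm{N}\sim^{10}_{7} \rightarrow$

$\begin{Bmatrix}\mathrm{KCT}\\ \mathrm{KPI}\end{Bmatrix}_1 + \$\,\overline{\mathrm{VTIC}\sim}^{16}_{2} + \text{UP TO NEXT}_3 + \mathrm{UX}_6 + \mathrm{AJXXXX}_4 + \mathrm{N}\sim^{10}_{7}$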

This example shows the matching of alternative cases and of a discontinued string, the reordering
of word sequence, and the deleting of a word. The bracketed expression means that either
KCT or KPI will appear at that position. The dollar sign with overscore means that any-
thing else except what is under the overscore will be applied to this rule. UP TO NEXT repre-
sents the discontinued portion of the string and connects the two halves of the rule.
The whole equation means that when either KCT or KPI is immediately followed by anything
else except VTIC~ and followed by some other words that are in turn followed by a string of
AJXXXX, HM, UX, and N~, the word AJXXXX should be moved to the position between UX
and N~, and the word HM should be eliminated.
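
The effect of rule 2 on a sequence of tags can be sketched as below. This is an illustrative restatement, not the dictionary II coding: it assumes one tag string per word, treats UX as any tag beginning with U and N~ as any tag beginning with N, and applies the rule only to the first match. All names and sample tags are hypothetical.

```python
def reorder_rule2(tags):
    """Rule 2: after KCT or KPI (not followed by VTIC), move AJ between UX and N, drop HM."""
    out = list(tags)
    for i, tag in enumerate(out):
        if tag not in ("KCT", "KPI"):
            continue
        if i + 1 < len(out) and out[i + 1].startswith("VTIC"):
            continue  # the exception: KCT or KPI immediately followed by VTIC~
        # UP TO NEXT: skip any number of words until AJ + HM + UX + N is found.
        for j in range(i + 1, len(out) - 3):
            aj, hm, ux, n = out[j:j + 4]
            if (aj.startswith("AJ") and hm == "HM"
                    and ux.startswith("U") and n.startswith("N")):
                out[j:j + 4] = [ux, aj, n]
                return out
    return out


moved = reorder_rule2(["KCT", "R1", "AJ0000", "HM", "UX", "N000000000"])
blocked = reorder_rule2(["KCT", "VTIC00", "AJ0000", "HM", "UX", "N000000000"])
```

In the first call the adjectival lands between UX and the noun and HM disappears; in the second the VTIC word blocks the rule and the sequence is returned unchanged.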

3. $\mathrm{US}_1 + \mathrm{UP}_2 + \mathrm{UP}_3 + \mathrm{UP}_4 + \mathrm{NT3S0N00U0}_5 \rightarrow \mathrm{NT3S0N30U0}_5 + \underline{\text{of} + \mathrm{UA}_1 + \mathrm{UA}_2 + \mathrm{UA}_3 + \mathrm{UA}_4}$

This is an example of noun phrase reordering. When there are four numerals (US or UP) that
are followed by a noun of time (NT3S0N00U0), they should be presented in arithmetic form (UA)
and moved to the position after the noun with "of" in front of the numerals. After reordering, the
underlined portion in the function (of + UA + UA + UA + UA) is masked until final translation of
English. The nominal tag 7 is therefore changed from 0 to 3 to indicate the formation of a nominal
of time phrase.
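
A minimal sketch of rule 3 on a list of tags follows. It assumes one tag string per word and represents the inserted "of" as a plain list element; the masking of the reordered portion is not modeled. The function name is hypothetical.

```python
def reorder_time_phrase(tags):
    """Rule 3: four numerals + noun of time -> noun of time + 'of' + four UA numerals."""
    out = list(tags)
    for i in range(len(out) - 4):
        numerals, noun = out[i:i + 4], out[i + 4]
        if all(t in ("US", "UP") for t in numerals) and noun == "NT3S0N00U0":
            # Nominal tag 7 (the seventh character) changes from 0 to 3.
            new_noun = noun[:6] + "3" + noun[7:]
            out[i:i + 5] = [new_noun, "of", "UA", "UA", "UA", "UA"]
            return out
    return out


phrase = reorder_time_phrase(["US", "UP", "UP", "UP", "NT3S0N00U0"])
```

The seventh character of the noun tag is the only one that changes, which matches the difference between NT3S0N00U0 and NT3S0N30U0 in the rule.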

2.5.4 Operation Codes for Contextual Associative Method 

Operation codes indicated in dictionary II will perform the special matching operation and the
linguistic changes required by translation. These operations will allow the input data to be mani-
pulated in different ways to obtain the desired form. Ambiguities in word classification will be
resolved. Noun phrases, verb phrases, and adverbs will be reordered and put into proper position.
The subject, the main verb, and the object of a sentence will be detected and connected to each
other. These operation codes are described and explained individually as follows; examples of
their use are given later.

Don't Care Code (-). A hyphen in the argument will associate with any character in the input
string at that position. One hyphen will match only one character. The hyphens in the argument
are therefore used as unspecified characters to associate with corresponding characters in the
input. (The don't care code is identical with X in linguistic analysis.)

Up to Next (→). The right arrow will appear in the argument for connecting two separate
parts of a rule. This operation is created because some linguistic rules require the association
of two unrelated strings to provide sufficient information. The gap in between may contain varia-
ble length data, and it is therefore not known how many don't care codes are needed. This up to
next code will fill the gap and connect the two separated parts to enable a complete association.
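
The don't care and up to next codes behave much like two wildcard forms in a modern pattern language. The sketch below translates an argument into a Python regular expression under that reading; it is an analogy offered for illustration, not a description of the original lookup hardware or program, and the function name is hypothetical.

```python
import re


def compile_argument(argument):
    """Translate an argument into a regular expression.

    '-' (don't care) matches exactly one character;
    '→' (up to next) matches a gap of unknown length, as short as possible.
    Every other character must match literally.
    """
    parts = []
    for ch in argument:
        if ch == "-":
            parts.append(".")
        elif ch == "→":
            parts.append(".*?")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


fixed = compile_argument("aa----)")
gapped = compile_argument("kct(→/hm)")
hit = gapped.search("kct(abc/xyz)def/hm)")
```

The first pattern needs exactly four characters between `aa` and `)`, while the second accepts any amount of data between `kct(` and the next `/hm)`.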
Save Input (.). The period is used in the function to save the associated portion of the input
as indicated in the corresponding argument. This code will usually appear at the beginning of the
function to alter the input string. Without this code at the beginning of the function, the associated
portion of input will be deleted from the input string.

Rho Stuffing (‾). The overscore in the function will insert the character underneath it, and
those following, into the input string. Characters to be inserted start from the one under the
overscore and end at the one before any operation code or blank.

Shift Code (· or ·m). The middle dot indicates that the assumed pointer in the input string is
to be shifted to the right. A letter m immediately following the middle dot will change the shifting
direction to the left of the pointer. Numbers following the shift code indicate how many positions
to be shifted. The assumed pointer is the position in the input string where the next table lookup
will start. The pointer will appear in the intermediate printout as an underscore (_).
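
The worked examples in this report pair the code 999999998 with a shift of 80 characters and the code 992 with a shift of 20 positions, which suggests that the digits following a shift code are added together. The sketch below decodes a shift code under that inferred reading; the function name and the use of a signed result for direction are assumptions.

```python
def shift_distance(code):
    """Decode a shift code such as '·992' or '·m992' into a signed distance.

    Positive values shift the pointer right, negative values shift it left.
    The digits are summed, an interpretation inferred from the examples.
    """
    if not code.startswith("·"):
        raise ValueError("a shift code starts with a middle dot")
    body = code[1:]
    direction = 1
    if body.startswith("m"):
        direction = -1
        body = body[1:]
    if not body.isdigit():
        raise ValueError(f"expected digits after the shift code: {code!r}")
    return direction * sum(int(digit) for digit in body)
```

For example, `shift_distance("·999999998")` gives 80 and `shift_distance("·m992")` gives -20.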

Ignore Codes ([, ], etc.). There are six sets of ignore codes designed to ignore por-
tions of the input for future table lookup. These ignore codes are used only in pairs of the same
code, e.g., [ and [, < and <. The first ignore code of the pair indicates the beginning of the ignored
portion while the second ignore code indicates the end. One pair of ignore codes could be
included in another pair of ignore codes if necessary. (The ignore operation is identical with the
masking function in the linguistic analysis.)
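
The effect of a pair of ignore codes on later lookups can be sketched as a function that returns only the visible part of a string. This is an illustration under stated assumptions: only two of the six code sets are modeled (`[` and `<`), a pair is taken to run from one code to the next occurrence of the same code, and the sample string is hypothetical.

```python
def visible_text(stream, codes="[<"):
    """Return the stream with every portion between paired ignore codes removed."""
    out, i = [], 0
    while i < len(stream):
        ch = stream[i]
        if ch in codes:
            # The ignored portion runs to the next occurrence of the same code,
            # so a different pair nested inside it is skipped as well.
            close = stream.find(ch, i + 1)
            if close == -1:
                raise ValueError("ignore codes must be used in pairs")
            i = close + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


seen = visible_text("noun01/nt3s0n30u0)<zofzzz/z)num001/ua)<")
nested = visible_text("ab<cd[ef[gh<ij")
```

Only the unmasked noun remains visible in the first call, and the second call shows one pair enclosed in another.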

Masking Code (,). The comma is used in the function to mask or skip the unchanged charac-
ter in the argument, i.e., the comma in the function will save the corresponding character in the
argument as it is, therefore retaining it in the input in original form. One comma will skip only
one character.

Save Copy (↑). The up arrow in the function will save part of the argument and cause the
saved portion to be moved to the desired position. Masking codes usually follow the up arrow to
indicate the corresponding characters that will be moved.

Restore Copy (+). A plus sign following a shift operation in the function will restore the part
saved to the designated position. This operation code is used in conjunction with a save copy code
and a shift code to accomplish the reordering of data strings.

Reset Pointer (~). A tilde at the end of the function will reset the pointer to the beginning of a
sentence. This code is often used at the completion of rule application to stop further searching
of the rest of the sentence, and to restart the searching from the very beginning of the sentence.

2.5.5 Coding in Dictionary II

The three examples are brought over to show the actual coding of individual rules. The
coding of example 1 is:

Argument              Function

aa----)------/hm)     .x̅1=

x1=aa                 aj~

These two dictionary entries will accomplish the simple ambiguity rule in example 1. The
hyphen stands for anything, such as a letter, a number, or a symbol. When the argument of the
first entry is addressed, its function specifies the saving of the associated portion of the input
stream and the moving of the pointer to the left of the argument (indicated by a period), and the
stuffing in of x1= at the location of the pointer (indicated by an overscore). The string in the I/O
buffer will then show as follows, with the underscore designating the pointer:

• x1=aa----)------/hm)

The second entry will associate with the above string and replace x1=aa with aj. The pointer
will be moved back to the very beginning of the sentence being processed (indicated by ~). The
string will show as:

• aj----)------/hm)

The coding of example 2 is:

Entry 1
  Argument: kct(------/vtic------------)→------/aj----)------/hm)------/u-)------/n---------)
  Function: .·999999998

Entry 2
  Argument: kpi(------/vtic------------)→------/aj----)------/hm)------/u-)------/n---------)
  Function: .·999999998

Entry 3
  Argument: kct(→------/aj----)------/hm)------/u-)------/n---------)
  Function: .·4↑,,,,,,,,,,,,,,·992+·m992x̅20=

Entry 4
  Argument: kpi(→------/aj----)------/hm)------/u-)------/n---------)
  Function: .·4↑,,,,,,,,,,,,,,·992+·m992x̅20=

Entry 5
  Argument: x20=------/hm)
  Function: ~

Example 2 needs five dictionary entries to accomplish the rather complicated requirement.
In the first entry, any portion of the input stream meeting these conditions will associate with the
example 2 rule, which begins with KCT + VTIC~. Its function specifies the saving of the as-
sociated input stream (indicated by a period) and the shifting of the pointer to the right (indicated
by a middle dot) by 80 characters (indicated by 999999998). This will meet the requirement of
example 2 that if KCT or KPI is immediately followed by VTIC~ and the rest, this rule will
not apply. The second entry is the same as the first entry except that KPI replaces KCT, to
ensure that the rule is applied in neither case.

Since the exceptions of example 2 have been taken care of, the third and fourth entries will include
all cases applicable to this rule. The up to next code (→) will connect the two separated parts of the
rule. Their functions indicate the saving of that portion of the input stream, the shifting of the
pointer to the right by 4 characters (indicated by ·4), the moving of the following 14 charac-
ters (indicated by ↑,,,,,,,,,,,,,,) to the right by 20 positions (indicated by ·992), and the restoring of
them (indicated by +). The pointer is then shifted left by 20 positions (indicated by ·m992) and
x20= is inserted. After the operations of this entry, the original string (argument of the third or
fourth entry) will appear as:

• kct(→x20=------/hm)------/u-)------/aj----)------/n---------)

or as:

• kpi(→x20=------/hm)------/u-)------/aj----)------/n---------)

The pointer is now at x and the argument in the fifth entry will therefore be addressed. Be-
cause there is no save code (.) at the beginning of its function, the argument will be eliminated
from the data string. The tilde in the function will reset the pointer to the beginning of this sen-
tence for other table lookups. The string after the application of the fifth entry will be shown as:

• kct(→------/u-)------/aj----)------/n---------)

or as:

• kpi(→------/u-)------/aj----)------/n---------)
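
The character arithmetic of example 2 (a 14-character word moved right by 20 positions, then a marker stuffed 20 positions back) can be replayed with ordinary string slicing. The sketch below is an illustration only: it assumes each word is a six-character semantic tag, a slash, its grammatic tags, and a closing parenthesis, and the sample semantic tags and helper names are hypothetical.

```python
def word(semantic, tags):
    """One word in the stream: six-character semantic tag, slash, grammatic tags, ')'."""
    assert len(semantic) == 6
    return f"{semantic}/{tags})"


def apply_entry_3(stream):
    """Move the AJ word past HM and UX, then stuff the marker x20= (entries 3 and 4)."""
    p = stream.index("/aj") - 6                  # pointer at the start of the AJ word
    saved = stream[p:p + 14]                     # save copy: up arrow plus 14 commas
    stream = stream[:p] + stream[p + 14:]        # the saved word leaves its old place
    p += 20                                      # shift right by 20 (code 992)
    stream = stream[:p] + saved + stream[p:]     # restore copy (+)
    p -= 20                                      # shift left by 20 (code m992)
    stream = stream[:p] + "x20=" + stream[p:]    # rho stuffing of x20=
    return stream, p


def apply_entry_5(stream, p):
    """Drop x20= and the 10-character HM word: no save code, so the match is deleted."""
    assert stream.startswith("x20=", p)
    return stream[:p] + stream[p + 14:]


gap = word("gap001", "r0")
aj = word("adj001", "aj0000")
hm = word("spc001", "hm")
ux = word("num001", "u0")
n = word("nou001", "n000000000")

before = "kct(" + gap + aj + hm + ux + n
middle, pointer = apply_entry_3(before)
after = apply_entry_5(middle, pointer)
```

Under these assumptions the AJ word is 14 characters long and the HM and UX words are 10 characters each, which is where the shift of 20 comes from.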



The coding of example 3 is:

Entry 1
  Argument: ------/us)------/up)------/up)------/up)------/nt3s0n00u0)
  Function: .,,,,,,/ua),,,,,,/ua),,,,,,/ua),,,,,,/ua)·94x̅22=

Entry 2
  Argument: x22=00u0)
  Function: 30u0)zofzzz/z)·m999x̅23=

Entry 3
  Argument: x23=------/nt3s0n30u0)zofzzz/z)
  Function: .·4↑,,,,,,,,,,,,,,,,,,,,,,,,,,,·m99998+·99<·999994x̅24=

Entry 4
  Argument: x24=x23=
  Function: <~



This rule of reordering noun of time phrases is facilitated by four entries in dictionary II.
When the argument of the first entry is matched by the string in the I/O buffer, its function indi-
cates the saving of the argument (indicated by a period) with the masked portion to be protected
(indicated by commas) and the unmasked portion to be replaced. The pointer is then shifted to the
right by nine and four characters and x22= is inserted there. After the application of the first
entry, the string will show as:

• ------/ua)------/ua)------/ua)------/ua)------/nt3s0nx22=00u0)

The last portion of the string will match the argument of the second entry and is to be re-
placed by 30u0)zofzzz/z). The pointer is then moved to the left (indicated by ·m) by 9+9+9 char-
acters to insert x23=. The string will show as:

• ------/ua)------/ua)------/ua)------/ua)x23=------/nt3s0n30u0)zofzzz/z)

The portion starting from x23= will associate with the argument of the third entry. The function of
the third entry instructs the saving of the associated portion of the input stream (indicated by .)
and the moving of the 27 characters (indicated by ↑ and 27 commas) that follow x23= to the left
(indicated by ·m) by 9+9+9+9+8 characters. It then instructs the shifting of the pointer to the right
(indicated by ·) by 9+9 characters to insert the ignore code (<) and again the shifting to the right
by 9+9+9+9+9+4 characters to insert x24=. The string will become:

• ------/nt3s0n30u0)<zofzzz/z)------/ua)------/ua)------/ua)------/ua)x24=x23=

The last portion of the above string will associate with the argument of the fourth entry. Its
function indicates the replacing of the associated input stream segment by another ignore code (<)
and the moving back of the pointer to the very beginning of the sentence (indicated by ~). The
final appearance of the string will be:

• ------/nt3s0n30u0)<zofzzz/z)------/ua)------/ua)------/ua)------/ua)<

2.5.6 Conclusion 

To give a precise picture of computer processing results, a two-sentence paragraph of
Chinese is chosen as an example for illustration. The Chinese text is prepared and processed
according to the procedures described above. The Chinese text in both Chinese and Chicodes,
three dictionaries, and outputs of each pass are listed in Appendix B. However, for easier under-
standing, dictionary II (linguistic rules for the sample paragraph) is listed in linguistic writing
format rather than in the computer pseudocoding language.

Briefly, this experimentation has successfully adapted the linguistic analyses to computer
implementation of language processing. This developed and tested machine translation system is
not a simple word by word machine translation but will utilize and apply accumulated efforts of the
linguistic research to give an output closer to that of human translation. The quality of the ma-
chine translation can always be improved by enlarging and revising morphological and syntactical
dictionary entries, which are the direct result of the linguistic research.

The software system employed is open ended in every sense, in that rules can be added for 
exceptional cases and rules can be added to apply to greater context without requiring special 
programming and without causing any conflict (unless it supersedes a rule). 

Although this system is designed for Chinese to English translation, its principle and metho-
dology could be effectively applied to the inverse translation, i.e., English to Chinese, or to other
language translations, e.g., Russian to English or German to English.




3. MORPHOLOGICAL ANALYSIS 



An intensive study of the morphology of the modern Chinese language was made during the
past year. In the course of this study, many factors in the Chinese language were considered.
Monosyllabic as well as polysyllabic Chinese words were examined for their functions in relation
with other words. Both semantic and word class ambiguities were taken into account. Inflections
for the three primary morphological classes (nominals, verbals, and adjectival adverbials) were
considered in relation to monosyllabic and polysyllabic words whose definitions are enhanced by
these inflections. The relative importance of punctuation marks in the segmentation of phrases
and clauses was examined. In the composition of morphological classes, ample room was allowed
for adjustments, modifications, and additions.

Each major morphological class has a series of subtags, and each subtag gives specific gram-
matic information for a particular word in that class. For example, the adjectival-adverbial class
has subtags that denote degree, tense, and type of word modified by adverbials. Up to this point,
14 morphological classes have been defined, and room for addition of subtags to each class and for
formulation of more word classes has been allowed. Tag 1 is the designation for each major
morphological class, the subtags for which are illustrated in the general table (Table 3-1) and
explained in this section of the report. The specific tables for the Chinese morphological tags are
shown in Tables 3-2 through 3-8.

3.1 MORPHOLOGY TAGS AND EXPLANATIONS 

The 14 word classes outlined in the general and specific tables for Chinese morphological
tags are utilized in making lexicographic entries. Specifically, the major word class and its sub-
tags are listed as Chinese grammatic tags for each entry in the dictionary format. Each Chinese
word may, of course, have one or more word classes for morphological analysis. Table 3-9 gives
the symbol, the terminology as used in our linguistic analysis, and the general definition for each
major word class (tag 1).

Table 3-6 — Specific Table for the Connominal Class Tags

Tag 1 — Kind:        I = connominal
Tag 2 — Type:        A through W = individual connominal words (Chinese characters illegible)
Tag 3 — Regularity:  R = regular; I = irregular; A = with special adjective;
                     C = comparison connominal
Tag 4 — Position:    R = preverb; P = postverb; B = R and P; I = independent of verb;
                     A = preadjective or comparison
Tag 5 — Quality:     N = followed by noun or noun phrase;
                     V = N, or followed by verb (present participle) and noun;
                     S = clause or noun phrase

Table 3-7 — Specific Table for the Converbal Class Tags

Tag 1 — Kind:    E = converbal
Tag 2 — Type:    N = negative; P = present participle; T = tense indicator;
                 A = passive voice indicator; R = relative clause;
                 I = important verb indicator; C = complement
Tag 3 — Subtag:  I = initial; T = terminal; B = I and T;
                 A, C = specific words (Chinese characters illegible);
                 3 = untranslatable; 6 = translate as it is
Tag 4 — Tense:   F = future; P = present; R = progressive; E = perfect; 0 = none



Table 3-8 — Specific Table for the Collocative Class Tags

Tag 1 — Kind:                        L = collocative
Tag 2 — Type:                        I = initial; T = terminal
Tag 3 — Translation Classification:  for initial collocatives, A through D = groups of specific
                                     words (Chinese characters illegible); for terminal
                                     collocatives, A through F = groups of specific words
                                     (Chinese characters illegible)

Table 3-9 — Symbols, Terminology, and General Definition for Major Chinese Word Classes

Symbol  Terminology                 General Definition

N       Nominals                    Nouns
A       Adjectival adverbials       Major attributes of nouns and verbs
V       Verbals                     Verbs
L       Collocatives                Prepositions of discontinuous structure
R       Arithmates                  Demonstratives
S       Synonomes                   Classifiers and measurements
U       Numerals                    Numbers
J       Conjunctions                Equal conjunctions
E       Converbals                  Words contributing to verbs in regard to tense, aspect,
                                    voice, negation
G       Auxiliaries                 Auxiliary verbs
O       Con-adjectival-adverbials   Words that formulate adjectives or adverbs
P       Punctuation                 Punctuation marks
I       Connominals                 Prepositions
H       Special words               Special words

3.1.1 General Tag Terminology

The following paragraphs briefly explain the general tag terminology.

Type. The types of a word include subclasses that denote either semantic or grammatic
qualities of differentiation. For example, nominals are differentiated according to meaning; they
fall into the classes of human, country, animal, abstract, concrete, place name, etc. Verbs are
classified according to grammatic function, such as transitivity with nominal object, transitivity
with both direct and indirect object, and intransitivity.

Position. The positions of a word include subclasses that denote its possible relative positions
between words. For example, collocative initial precedes collocative terminal, converbal
initial precedes the verb, and converbal terminal follows the verb.

Number. Number classification is tagged in accordance with the English translation as to
singular and plural, since number inflection is not shown in Chinese. Word classes such as nomi-
nal, arithmate, and numeral have number tags.

Translatability. Synonomes are either translatable or untranslatable. For example, [illegible] is
untranslatable, while [illegible] (kind) is translatable. Some adverbs, such as [illegible] and
[illegible], which are presently difficult to translate because they change their meanings in
different contexts, are temporarily treated as untranslatable. In further linguistic analysis of this
category, it is hoped that appropriate English syntactic structures that transfer the exact meaning
of these words will be found.

Specific Word Indicators. Con-adjectival-adverbial, connominal, and special (H) classes of
words have specific tags, so that no specific Chicodes are needed to identify them.

Person. Nominals and verbals use person tags for English person verb inflections in refer-
ence to first, second, and third person, since Chinese has no verb person inflection.

Tags of specific interest will be referred to and explained in detail in the discussion of vari-
ous word classes and of the utilization of information tags.

3.2 NOMINALS

The nominals have ten information tags, shown in Table 3-3, that include Chinese and English
grammatic tags. In the study of the nominals, several problems were considered. The first
problem involves the fact that differentiation must be made among pure nominals, verbal/nominals
(V/N), and adjectival/nominals (A/N). Analyses of nominals in terms of semantic differences
were undertaken, and nominals were grouped for proper translation in their various
relationships with other words. Studies were made concerning: (1) nominals that, when preceded
by [illegible] (SQil)* and followed by [illegible] (VFll), can be changed into English adjectival
forms, and (2) nominals that, when followed by [illegible], can be changed into English adjectival
or adverbial forms.

The problem of utilizing one-character words that have functions in specific word classes,
but may also serve as family names and therefore require romanization, was also taken into
account. Words such as [illegible] (president) and [illegible] (chairman), which can also be used
as titles for proper names, were analyzed for the purpose of assigning appropriate subtags.

There is also the problem of categorizing nominals that function as tense indicators or as
collocative phrase endings. Person, number, capitalization, and translatability of the nominal
class were analyzed in the study of nominals.

3.2.1 Nominal Tag 1 — Kind 

The most common kinds of nominal ambiguities are V/N and A/N, which are different from
pure nominals; pure nominals have only the nominal grammatic function. Two examples of pure
nominals are:

1. [illegible] (people)
2. [illegible] (war).

A V/N is a nominal that can also function as a verb. Three examples of V/N's are:

1. [illegible] (work)

2. [illegible] (represent, representative)

3. [illegible] (organize, organization).

A V/N is different from a pure verbal with an English equivalent noun form. For example, in
[illegible] (his coming), [illegible] (come) is not a V/N, but a pure V that has an English noun
form. Usually, [illegible] (HM) is not needed to make a V/N function as a nominal, while a pure V
requires that HM precede it, e.g., [illegible] (Communist organization).

An A/N is a nominal that sometimes functions as an adjectival. The following are examples
of A/N's:

1. [illegible] (happy, happiness)

2. [illegible] (difficult, difficulty)

3. [illegible] (health, healthy).

* Codes in parentheses refer to Chicodes for the Chinese character.

When functioning as adjectivals, these words must usually be followed by HM or preceded by
ADRR, such as [illegible] (very), [illegible] (very), and [illegible] (quite). Two examples are:

1. [illegible] (difficult problem)

2. [illegible] (healthy life).

However, when these words function as nouns, they need not be preceded by HM. For example:

1. [illegible] (economic difficulty)

2. [illegible] (psychological health).

A pure adjectival, however, must have HM preceding it to cause it to take noun form.

Although words such as [illegible] (history) and [illegible] (agriculture) have equivalent English
adjectival forms (historic, agricultural), they are entered as nominals with adjectival forms. A
rule causes them to take adjectival form when they precede a nominal without HM in between:

$\mathrm{ND}_1 + \mathrm{N}_2 \rightarrow \mathrm{AJXXXX}_1 + \mathrm{N}_2$

3.2.2 Nominal Tag 2 — Type 

The nominals are divided into eleven subclasses, as follows. 

Pronominal (M). This subclass includes all pronouns, e.g., [illegible] (you), [illegible] (I), and
[illegible] (he).

Country and Continent (C). This subclass includes all nominals that are proper names of
countries and continents. Although these are classified as pure nominals, the adjectival forms are
included in the English translation. Some examples are: [illegible] (U.S., American) and [illegible]
(Africa, African).

Mea (D). This subclass includes the designations of theory, point of view, concept, life, com- 
position, etc.— -names of nonconcrete things that are abstractions or generalizations, e.g., 
(editorial), ^ *1^' (conference), and (opposite). This category of nominals sometimes 

utilizes the YD&S tag (tag 7). 

Organization (O). This subclass includes the designations of agency, group, company, etc. — 
nominals that have a collective, active authority to perform specific activities. Some examples 
are: (delegation), (party), and iS} ^ (coimiry). 

Human (H). This subclass includes terms for human individuals. Some examples are: 
(madam), (chairman), and (ambassador). 
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Human CollectiYe (E). This subclass includes generalized names for groups of people or for 

nonspecific persons. The English translation for such words is usually plural in number. Dis- 
cretion should be used in differentiating these nominals from human nominals, which are mainly 
singular in number. Human collective nominals do not xise the address tag (tag 8). Some examples 
are: ^ (people), (workers), ^ (students), and (capitalists). 

Inanimate (I). This subclass is defined to include all things that are concrete and neither
human nor animal, such as stones, trees, places with common names, radio stations, parts of the
body, and buildings. Some examples are: (house), (clothes), and (automobile).

Beasts (B). This subclass includes all live animals, singular or collective. Some examples
are: (domestic fowls), (dogs), (sheep), (birds), and (fish).

Place Names (P). This subclass includes all proper place names, excluding countries and
continents. Some examples are: (Berlin), (Shanghai), and (Boston).

Nominals of Time (T). This subclass includes special nominals that involve the concept of
time. These function as tense indicators and utilize the definity tag (tag 9). Some examples are:
(August), (today), (year), (present), and (time).*

Collocative Terminals (K). This subclass includes words that are basically nominals, but
are often used as collocative phrase endings. Some examples are: (aspect) and (aspect).

3.2.3 Nominal Tag 3 — Person

All nominals are in the third person except (you), which is in the second person, and (I)
and (I), which are in the first person.

3.2.4 Nominal Tag 4 — Number

The number tag is classified according to the English translation of the Chinese word.

3.2.5 Nominal Tag 5 — Capitalization

Proper nominals use P for the capitalization tag; common nominals use 0. 



*See Section 3.2.9 for specific discussion of the nominal of time.

3.2.6 Nominal Tag 6 — Gender

Nominals of the subclasses denoting idea, organization, inanimate, continent, place name,
beast, nominal of time, and collocative terminal usually use N (neuter) for their gender tag.
Country names usually use F (feminine). Nominals that are concerned with human beings and are
not specifically known as either feminine or masculine are all considered to be masculine (M).
For example, (students) and (people) use M in the gender tag, but (girl)
and (women) use F in the gender tag.
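
The default assignment just described can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name and the `is_continent` and `known_feminine` flags are hypothetical, and the report leaves the actual decision to the dictionary compiler. The pronominal subclass (M) is not covered by the stated defaults, so the sketch refuses to guess for it.

```python
# Nominal tag 2 codes from Section 3.2.2 that usually take neuter gender.
NEUTER_SUBCLASSES = {"D", "O", "I", "B", "P", "T", "K"}
# Subclasses concerned with human beings (human, human collective).
HUMAN_SUBCLASSES = {"H", "E"}


def default_gender(subclass, is_continent=False, known_feminine=False):
    """Return the usual tag 6 value (N, F, or M) for a nominal.

    `is_continent` and `known_feminine` are hypothetical flags used only
    for this illustration.
    """
    if subclass in NEUTER_SUBCLASSES:
        return "N"
    if subclass == "C":
        # Continents are neuter; country names are usually feminine.
        return "N" if is_continent else "F"
    if subclass in HUMAN_SUBCLASSES:
        # Human nominals default to masculine unless known to be feminine.
        return "F" if known_feminine else "M"
    raise ValueError(f"no default gender stated for subclass {subclass!r}")
```

For example, `default_gender("E")` gives M for a word such as (students), while `default_gender("H", known_feminine=True)` gives F for a word such as (girl).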

3.2.7 Nominal Tag 7 — YD&S Tag

Some nominals in the subclass “idea” go inside the phrase ... and become adjectival.
Some examples are:

(brave)

(significant).

These idea nominals use Y in the YD&S tag, and the English adjectival forms
(brave, significant) are included in the English translation. However, single-character idea
nominals, which also go into the YD phrase and become adjectival, need not have the
YD tag, and the English adjectival forms need not be included in the English translation.

Some nominals become adjectival or adverbial when followed by a particular character, e.g.,
(political, politically) and (lexicographical, lexicographically). These nominals are marked
in the YD&S tag, and the A forms are included in the English translation.

3.2.8 Nominal Tag 8 — Address 

Some human nominals can sometimes be used as the title of a proper name. For example:

1. (Mr. Wang)

2. (Chairman Mao).

These human nominals use A in tag 8.

3.2.9 Nominal Tag 9 — Definity 

This tag is used primarily for the nominals of time, which are words concerning the concept
of time. A T is used as indicator in nominal tag 2, and tag 9 is used to indicate the
subclasses into which the nominals of time are divided. At present, there are six major
subclasses, as follows.

Nominal of Time — Definite. The names of the twelve months of the year, the seven days of the
week, the different seasons, the different periods of time of day and night, etc., are included in
this subclass. A D is put in tag 9 as indicator. When these words are not preceded by other
nouns of time or followed by either VSl or HM, they are treated as collocative phrases
at a certain phase of the translation scheme. Some examples of such words are:

(Friday), (noon), (spring), and (December).

Nominal of Time — Indefinite. Another group of nominals of time has the function of collocative
termination. The nominals in this group are classified as indefinite nominals of time rather than
as collocative terminals. If one of these words is preceded by a collocative initial, it functions as
a terminal. If not, it is a regular nominal of time. An I in tag 9 is used as indicator. Some
examples of such words are: (time), (day), and (at the same time).

Nominal of Time — Numeral. A few words are used to measure time, and are classified as
nominal of time numerals. Like other measurement words in the synonome class, when these
words are preceded by a numeral the resulting phrase can be either a nominal or an adverbial
phrase. Some examples are: (year), (day), and (week).

Nominal of Time — Relative. There is a group of words that indicate the relativity of time. In
Chinese, words of this group are used to indicate the time of the sentence relative to the present.
Such words are classified as relative nominals of time. An R is put in tag 9 as indicator. Since
these words usually overlap with the adverbial class, they are classified as AD/NTR. The
adverbial tag 5 indicates tense (see Section 3.3). When these words precede any other nominal of
time, a nominal of time phrase is made and the tense tag is kept. Later the tense tag is
duplicated in the verbal tag 15 to indicate the correct tense inflection of the English verb form. When
these words precede HM, they take noun form and a regular noun phrase is made. If they precede
any other word, they take the adverbial form. The rules generated are:

1. AD₁/NTR + NTD₂ → NTD₂ + of + AD₁

2. AD₁/NTR + HM + NX₂ → NX₂ + of + NTR₁

3. NX + AD/NTR → NX + AD

4. AD/NTR + NX → AD + NX
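
The four rules for relative nominals of time can be sketched as a single left-to-right pass. This is a minimal illustration, not the report's program: tokens are assumed to be `(gloss, class)` pairs with English glosses standing in for the Chinese words, the class labels `NTD`, `HM`, and `NX` are taken literally from the rules, and `LITERAL` is a hypothetical label for the inserted word "of". Treating every remaining context as adverbial is also an assumption that extends rules 3 and 4.

```python
def resolve_relative_time(tokens):
    """Apply rules 1-4 to a list of (gloss, cls) pairs and return a new list."""
    out = []
    i = 0
    while i < len(tokens):
        gloss, cls = tokens[i]
        if cls != "AD/NTR":
            out.append((gloss, cls))
            i += 1
            continue
        following = [c for _, c in tokens[i + 1:i + 3]]
        if following[:1] == ["NTD"]:
            # Rule 1: the definite time word moves in front; the
            # tense-bearing AD form of the relative word is kept.
            out += [tokens[i + 1], ("of", "LITERAL"), (gloss, "AD")]
            i += 2
        elif following == ["HM", "NX"]:
            # Rule 2: HM is dropped and the relative word takes noun form.
            out += [tokens[i + 2], ("of", "LITERAL"), (gloss, "NTR")]
            i += 3
        else:
            # Rules 3 and 4 (and, by assumption, any other context).
            out.append((gloss, "AD"))
            i += 1
    return out
```

Under these assumptions, the pair (last year) + (August) is reordered to (August) + of + (last year), with the relative word keeping its adverbial class so that its tense tag remains available.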



Some examples of relative nominals of time are:

1. (last year)

2. (tomorrow)

3. (last night).

Nominal of Time Indicator. Certain words, such as (A.D.), (B.C.), and (A.D.), are used to
indicate the time of a nominal of time phrase. These words are classified as
nominal of time indicators. An N is put in tag 9 as indicator. The rule generated is:

• NTI + NP → NP + NTI


Nominal of Time — O. The NTO word is different from other nominal of time numerals
because of its irregular relationship to numerals. When this word occurs after a numeral and the
numeral is preceded by a definite nominal of time, the numeral is to take the UQ form, and NTO
is to be deleted. An O in tag 9 is used to indicate this word.



3.2.10 Nominal Tag 10 — Translatability 



This tag refers to the translatability of the entry. A 0 indicates that it can be translated, and 
a U that it cannot be translated. 



3.3 ADJECTIVAL-ADVERBIALS

The adjectival-adverbial class, shown in Table 3-4, includes all Chinese morphemes that
modify verbs or nouns. This group of modifiers is under the general heading of A, since some
words can be used to modify both verbs and nouns, and word classes would otherwise have to be
further subdivided into similar subtags.

3.3.1 Adjectival-Adverbial Tag 1

This tag is always A to distinguish it from other grammatical classes. 



3.3.2 Adjectival-Adverbial Tag 2 



This tag indicates the major divisions of the A class. Morphemes that are adjectivals, i.e.,
that can be used as attributes of nouns with or without the adjectival indicator (HM), or that can
be attributes of verbs or verb phrases only when followed by the adverbial indicator (OA), are
classified as subclass J. Examples of such adjectivals are:

1. (lofty)

2. (tense)

3. (legal)

4. (abject)

5. (unpopular).



Morphemes that are usually used as attributes to verbs without being followed by OA, but that
must be followed by HM when used as attributes of nouns, are classified as D. Some examples of
members of this class are:

1. (adequately)

2. (minimum)

3. (willfully)

4. (further).



Since the linguistic analysis has not yet reached the level that distinguishes a subordinate
clause from a main clause in a complex sentence, all subordinate clause indicators such as
(not only), (although), (if), and (and) are temporarily included in this subclass.



Morphemes that need not have HM or OA to be attributes to either nouns or verbs are
classified as A. Examples of members of this class are:

1. (illegal)

2. (positive)

3. (collective)

4. (continuous)

5. (overall).



We do admit that the above classification depends a great deal on the subjective judgment of the
classifier. It is hoped that the criterion of classification is always the structure most commonly
used in modern Chinese. According to the above classification, the following linguistic
rules are generated:

1. AA₁ + OA₂ + V₃ → AD₁ + V₃

2. AJ₁ + OA₂ + V₃ → AD₁ + V₃

3. AA₁ + HM₂ + N₃ → AJ₁ + N₃

4. AD₁ + HM₂ + N₃ → AJ₁ + N₃

5. AA + V → AD + V

6. AA + N → AJ + N
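
The six rules above all resolve a modifier to AD before a verb or to AJ before a noun, dropping any intervening indicator. A minimal sketch of that behavior follows, assuming tokens are `(gloss, class)` pairs and that the three-element rules are tried before the two-element ones; the table and function names are hypothetical.

```python
# Rules 1-4: modifier + indicator + head; the indicator is dropped.
THREE_WORD_RULES = {
    ("AA", "OA", "V"): "AD",
    ("AJ", "OA", "V"): "AD",
    ("AA", "HM", "N"): "AJ",
    ("AD", "HM", "N"): "AJ",
}
# Rules 5-6: an AA modifier directly before its head.
TWO_WORD_RULES = {
    ("AA", "V"): "AD",
    ("AA", "N"): "AJ",
}


def resolve_modifiers(tokens):
    """Rewrite modifier classes in a list of (gloss, cls) pairs."""
    out = []
    i = 0
    while i < len(tokens):
        classes3 = tuple(cls for _, cls in tokens[i:i + 3])
        classes2 = classes3[:2]
        if classes3 in THREE_WORD_RULES:
            out.append((tokens[i][0], THREE_WORD_RULES[classes3]))
            out.append(tokens[i + 2])  # keep the head, drop OA or HM
            i += 3
        elif classes2 in TWO_WORD_RULES:
            out.append((tokens[i][0], TWO_WORD_RULES[classes2]))
            out.append(tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out
```

A D-class word such as (adequately) followed by HM and a noun therefore comes out as AJ + N, while an A-class word such as (collective) needs no indicator at all.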



There are some Chinese adjectivals whose English equivalents must go after the nouns they
modify. In such cases, the adjectivals are classified as AP (adjectival postal) or AB (adjectival
postal or adverbial). Some examples are:

1. (compatriots abroad)

2. (machines in sets)

3. (harvest above norm).

When such classes appear, what is best and simplest for the longest match method and for
machine operations is usually taken into consideration.

There is a group of adjectivals whose members have a very close relationship with certain
connominals. These adjectivals are classified as special adjectivals (AS) with a completely
different set of tags. Some examples of such special adjectivals are:

1. (far)

2. (near)

3. (same)

4. (together)

5. (together).

All adjectivals and adverbials take noun form when preceded by HM and followed by a phrase
segmentation indicator (KXX):

• HM + A + KXX → HM + N + KXX

3.3.3 Adjectival-Adverbial Tag 3

This tag indicates the degree of the adjectival or adverbial. In Chinese, degree is indicated
by a group of independent words, while in English it is indicated by either a separate word or
a suffix. Most Chinese adjectivals and adverbials are therefore in regular form in the dictionary.
In context, when they are preceded by a degree indicator, a rule will combine the two Chinese
words into one, as follows:

• ADC₁ + AJR₂ → AJC₂

From the English tags, the machine will be able to find the correct form for the English equivalent
of the two Chinese words.
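
The combination rule can be sketched as follows. This is an illustration under stated assumptions: tokens are `(gloss, tag string)` pairs, the third character of the tag string is taken to be tag 3 (degree), R is taken to mean regular form, and the S (superlative) case is an assumed extension of the single rule given above, suggested by the ADSR code that appears in Section 3.3.4.

```python
def combine_degree(tokens):
    """Merge a degree indicator into the regular-form modifier that follows it."""
    out = []
    i = 0
    while i < len(tokens):
        gloss, tag = tokens[i]
        is_degree_word = tag[:2] == "AD" and tag[2:3] in ("C", "S")
        if is_degree_word and i + 1 < len(tokens):
            next_gloss, next_tag = tokens[i + 1]
            if next_tag[:1] == "A" and next_tag[2:3] == "R":
                # The two Chinese words become one entry whose degree
                # tag is copied from the indicator (e.g. AJR -> AJC).
                out.append((next_gloss, next_tag[:2] + tag[2] + next_tag[3:]))
                i += 2
                continue
        out.append((gloss, tag))
        i += 1
    return out
```

The English form itself (for instance a suffixed form versus a separate degree word) would then be chosen by table lookup on the combined tag, which this sketch does not attempt.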

3.3.4 Adjectival-Adverbial Tag 4

This tag is used to indicate the quality of the adverb. Most adverbials are attributes only to
verbals. A V or a K is used in tag 4. In Chinese, most adverbials precede the verbs they modify,
and in such cases they can be used as verb indicators. In English, however, some adverbs usually
go after the verbs they modify and others go before. The classifier is able to indicate this by
choosing a V or a K. The following rules are generated for this purpose:

1. ADXV₁ + V₂ → V₂ + ADXV₁

2. ADXK₁ + V₂ → ADXK₁ + V₂

It is discovered that degree indicators (ADCR, ADSR) are the only adverbs that can modify
verbs, adjectivals, and adverbs. Some examples are: (like most), (better way), and (do better).

There are certain adverbs that can only modify adjectives (ADRJ), e.g.,
(highly secretive document).

There are certain adverbials that do not influence the grammatical structure of the sentence.
Such adverbials are called “independent adverbials.” Because of the lack of time, the clause
conjunctions are also temporarily put into this class. Some examples of such adverbials are:

1. (for many years)

2. (but)

3. (actual)

4. (all).
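
The two placement rules for V-type and K-type adverbials given earlier in this section can be sketched as a simple swap. The sketch assumes tokens are `(gloss, tag string)` pairs, that the fourth character of an adverbial's tag string is tag 4, and that verb tag strings begin with V; the X in ADXV and ADXK is read as "any value of tag 3".

```python
def place_adverbs(tokens):
    """Move V-type adverbials after the verb; leave K-type ones in place."""
    out = list(tokens)
    i = 0
    while i + 1 < len(out):
        tag = out[i][1]
        next_tag = out[i + 1][1]
        goes_after_verb = tag.startswith("AD") and len(tag) >= 4 and tag[3] == "V"
        if goes_after_verb and next_tag.startswith("V"):
            # Rule 1: ADXV + V -> V + ADXV
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            # Rule 2 (ADXK + V stays as is) and all other contexts.
            i += 1
    return out
```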

3.3.5 Adjectival-Adverbial Tag 5

There is a group of modifiers that is used in Chinese to indicate time. Tag 5 is used for this
purpose. In machine operation, the adverbial tag 5 is moved into the verb tag 15, causing the
English verb to have the correct English tense inflection. Some examples are:

1. (present)

2. (in the past)

3. (at present)

4. (in those days).
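
The movement of adverbial tag 5 into verb tag 15 can be sketched as below. This is only an illustration: tokens are assumed to be dictionaries with hypothetical keys `cls`, `tag5`, and `tag15`, the tense codes are those listed for verbal tag 15 in Section 3.4.1, and the sketch assumes a single clause in which the last time modifier governs every verb.

```python
def propagate_tense(tokens):
    """Copy the tense code of a time modifier (tag 5) into each verb's tag 15."""
    tense = None
    for token in tokens:
        if token["cls"] == "A" and token.get("tag5"):
            tense = token["tag5"]
    result = []
    for token in tokens:
        copied = dict(token)  # leave the input tokens untouched
        if tense is not None and token["cls"] == "V":
            copied["tag15"] = tense
        result.append(copied)
    return result
```

With a modifier such as (in the past) carrying tag 5 = A, a following verb receives tag 15 = A, which the English output stage can then inflect as past tense.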

3.3.6 Adjectival-Adverbial Tag 6

There are certain important Chinese verbal indicators for which English does not have any
equivalent single words. For the present, they are not translated, but as the linguistic system is
refined, it is hoped that the correct translation may be decided on.

The major linguistic classes that overlap class A are A/N, AD/V, and AJ/V. The A/N’s are
introduced by the nominals. The AD/V’s are Chinese words that function as verbs when they are
not followed by a verb, but become attributes of verbs when they precede verbs in context.
Instead of classifying such words as VTE or VTD, it is much more convenient to classify them as
AD/V. Some examples of such words are:

1. (diligently, work hard)

2. (continuously, continue)

3. (separately, separate).

The following linguistic rules apply:

1. AD/V + V → AD + V

2. AD/V + $V* → V + $V
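
These two rules amount to one test on the following word. A minimal sketch, assuming tokens are `(gloss, class)` pairs in which an AD/V word carries both of its glosses as an `(adverb, verb)` tuple, and assuming that a word at the end of the input counts as "not followed by a verb":

```python
def resolve_ad_v(tokens):
    """Resolve each AD/V word to AD before a verb and to V otherwise."""
    out = []
    for i, (gloss, cls) in enumerate(tokens):
        if cls == "AD/V":
            adverb_gloss, verb_gloss = gloss
            next_is_verb = i + 1 < len(tokens) and tokens[i + 1][1] == "V"
            if next_is_verb:
                out.append((adverb_gloss, "AD"))  # rule 1
            else:
                out.append((verb_gloss, "V"))     # rule 2: anything except V follows
        else:
            out.append((gloss, cls))
    return out
```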

Class AJ/V includes all words ending with a certain character, such as (industrialized) and
(fascistize), and other words like (progressive) and (solid).

3.4 VERBALS

In the course of this study, the verbal class has been found to be the most interesting and the
most challenging, since it plays such an important role in sentence structure. Aside from its
primary function as the main verb in a sentence, the Chinese verb alters its form depending on the
phrase structure in which it is used. Several problems arise from this peculiarity of the Chinese
verb:

1. There is no inflected form for conjugation with person, number, or tense. The tense and
aspect can be specified by adverbial of time indicators and by other indicators, such as auxiliaries
and converbals.

2. In general, there is no distinction in Chinese verbs between transitivity and intransitivity.
With the exception of a limited number of verbs that do not take objects, most verbs can take
objects or their equivalent. It is therefore necessary to distinguish the types of object and indirect
object that the verb may take.

3. The verb plays an important part in influencing the translation of certain connominals
(prepositions). There are several connominals whose translations change according to the verb
used.

4. The bifunctional or even multifunctional nature of some verbs calls for different
classifications for the same word.

5. The verbal subtags concerning Chinese and English grammatical functions must both be
retained in the linguistic processing for optimum phrase and sentence structure analysis. As the
linguistic research progresses, more tags may have to be added for further refinement of the
linguistic system.

*The symbol $ means “anything except....”

3.4.1 Verbal Subtags 

The verbals have 16 subtags, as shown in Table 3-5. Tags 1 through 11 are Chinese
grammatic tags, and are included in the dictionary entry for each verb. Tags 12 through 16 are English
grammatic tags, and are inserted through linguistic processing by machine operation.

Verbal Tag 1. This tag identifies the major grammatic verbal class (V).

Verbal Tag 2. This tag identifies the verbal types, of which there are five: transitive (T),
intransitive (I), transitive and intransitive (B), special (S), and verbs of adverbial quality (W).

Verbal Tag 3. This tag identifies the object of the verb. The object includes the material that
immediately follows the verb and/or that is grammatically important to sentence structure. The
object might therefore include verbs and embedded sentences. When the object is in verb form, its
English equivalent is usually in infinitive form.

Verbal Tag 4. This tag identifies the complement that follows the object. Because of its
position, a word that is ordinarily called an indirect object is considered for machine purposes to
be a complement.

Verbal Tag 5. This tag is used to indicate a pretransitive, a type of connominal that functions as
an indicator of the object before the verb. An example is:
(please give him this book).

Tags 6 through 11 are called connominal tags. Several very common connominals change their
translation according to individual verbs. For example, one can be translated seven ways and
another can be translated six ways, depending on the verb. Seven kinds of connominals are therefore
included so that the proper translation of the connominals can be indicated by the verb tags.

Verbal Tag 6. This tag is called the connominal-C tag. It deals with a single connominal,
which can be translated seven ways:

1. Translate as “to”: (say to him)

2. Translate as “concerning”: (expresses agreement concerning this matter)

3. Translate as “in”: (I am interested very much in machine translation)

4. Translate as “with”: (India takes this condition to compromise with Red China)

5. Translate as “against”: (the U.S. declares war against Japan)

6. Translate as “from”: (the Government collects income tax from the people)

7. Translate as “at”: (We shoot at the enemy).

Verbal Tag 7. This tag is called the connominal-D tag. It deals with the connominal ID.
This connominal can be translated six ways, depending on the verb:

1. Translate as “to”: (I explain to him)

2. Translate as “from”: (I borrow a book from him)

3. Translate as “against”: (prepare war against Russia)

4. Translate as “toward”: (they walk toward the school)

5. Translate as “at”: (capitalists roar at their subordinates)

6. Translate as “with”: (I conciliate this matter with him).
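
The way a verb's connominal tags select the preposition can be sketched as a table lookup. The report does not state here which code values tags 6 and 7 actually hold, so the sketch simply assumes they hold the item numbers of the two lists above; the function name and the dictionary layout of `verb_tags` (tag number to value) are also hypothetical.

```python
# Translations of the connominal handled by tag 6 (connominal-C).
CONNOMINAL_C = {1: "to", 2: "concerning", 3: "in", 4: "with",
                5: "against", 6: "from", 7: "at"}
# Translations of the connominal ID handled by tag 7 (connominal-D).
CONNOMINAL_D = {1: "to", 2: "from", 3: "against", 4: "toward",
                5: "at", 6: "with"}

# Which verbal tag governs which connominal.
TAG_FOR_CONNOMINAL = {"C": (6, CONNOMINAL_C), "D": (7, CONNOMINAL_D)}


def translate_connominal(kind, verb_tags):
    """Pick the English preposition for a connominal from the verb's tags."""
    tag_number, options = TAG_FOR_CONNOMINAL[kind]
    return options[verb_tags[tag_number]]
```

A verb such as (declare war), entered with tag 6 set to item 5, would then yield "against", while a verb such as (borrow), entered with tag 7 set to item 2, would yield "from".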

Verbal Tag 8. This tag is called the connominal-K tag. It deals with the connominal IK
in the postverb position, and it can be translated in two ways:

1. Translate as “into”: (reconstruct China into Communistic country)

2. Translate as “as”: (we consider him as leader).

Verbal Tag 9. This tag is not at present used for Chinese grammatic information, but it is
retained for further linguistic development of verb and connominal relationships.

Verbal Tag 10. This tag deals with connominals that have irregular verbal forms. An N is
for connominal IN, which is translated as “into.” An O is for connominal IO,
which is translated as “as.” These are illustrated as follows:

1. IN translation in relation to verb: (Communist Party organizes farmers into troops)

2. IO translation in relation to verb: (U.S. imperialists see Vietnam as colony).

Verbal Tag 11. This tag concerns the use of connominals IA, which are
ordinarily used as conjunctions but become connominals when used with certain verbs.

Tags 12 through 16 concern the English functions of the verb. The English forms of the verb
are produced during the linguistic loop operations, i.e., during machine operations and table
lookup, which add to or refine the English tags during the passes that generate linguistic rule lookup.

Verbal Tag 12. This tag gives the main forms of the verb: main verb and negative verb (Y).

Verbal Tag 13. This tag indicates person (first, second, or third), infinitive form of the
verb (I), present participle form (P), past participle form (D), and auxiliary form (G).

Verbal Tag 14. This tag indicates number: singular (S) or plural (P).

Verbal Tag 15. This tag indicates tense: present (P), present perfect (E), past (A), past
perfect (S), future (F), future perfect (U), or present progressive (R).

Verbal Tag 16. This tag indicates voice: active (A) or passive (P). As the linguistic analysis
progresses, more form indicators may be added. A separate section on the English output is
included in Appendix C.
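
The English tags 13 through 16 can be pictured as a small validated record. This is a sketch under stated assumptions: the class and field names are hypothetical, the person codes "1", "2", and "3" are assumed because the report names the persons without giving their codes, and tag 12 is left out because only its negative value (Y) is stated.

```python
from dataclasses import dataclass

# Tag 13: person codes are assumed; the letter codes are from the text.
FORMS = {"1": "first person", "2": "second person", "3": "third person",
         "I": "infinitive", "P": "present participle",
         "D": "past participle", "G": "auxiliary"}
NUMBERS = {"S": "singular", "P": "plural"}                      # tag 14
TENSES = {"P": "present", "E": "present perfect", "A": "past",  # tag 15
          "S": "past perfect", "F": "future", "U": "future perfect",
          "R": "present progressive"}
VOICES = {"A": "active", "P": "passive"}                        # tag 16


@dataclass(frozen=True)
class EnglishVerbTags:
    form: str
    number: str
    tense: str
    voice: str

    def __post_init__(self):
        checks = ((self.form, FORMS, "form"), (self.number, NUMBERS, "number"),
                  (self.tense, TENSES, "tense"), (self.voice, VOICES, "voice"))
        for value, table, name in checks:
            if value not in table:
                raise ValueError(f"unknown {name} code {value!r}")

    def describe(self):
        """Spell the four codes out in words."""
        return (f"{TENSES[self.tense]} {VOICES[self.voice]}, "
                f"{FORMS[self.form]} {NUMBERS[self.number]}")
```

Because P appears in tags 13, 14, 15, and 16 with four different meanings, keeping the tags as separate named fields rather than one undifferentiated string makes each code unambiguous.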

Following is a discussion of the four types of verbs so far encountered.

3.4.2 Intransitives

There are four kinds of situations in which we consider the verb to be intransitive:

1. When a Chinese verb is used that truly takes no object, such as (arrive) and
(explode).

2. When a Chinese verb is used that is intransitive in the active voice, and whose English
equivalent must be in the passive voice. For example, one such verb would be “be born” in
English.

3. In Chinese the object is included in the verb, while the English equivalent is one word. For
example, the verb for “rain” is actually composed of verb and object in Chinese, while the English
equivalent is one word, “rain.” To save computer operation and lookup, (rain heavily),
(rain lightly), and (drizzle) are also considered as single unit words by the
longest match. As computer operation becomes more sophisticated, the division of these words
for linguistic refinement will be considered. Other examples include such words as (sow),
(read), (walk), (work).